Fast and Accurate Language Detection in Short Texts using Contextual Entropy

Similar resources

Fast and Accurate Language Detection in Short Texts using Contextual Entropy

In this work, we address the problem of language identification (LI) on short segments of text. The central idea is to compute the entropy of a document in different contexts and assign it to the category where the entropy is maximal. Only word distributions are needed for the task; no other training is done. For LI, the contexts are the languages, and classification is done by just evaluating th...
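
The idea in the abstract above can be illustrated with a minimal Python sketch. The abstract is truncated and does not give the exact formula, so everything here is an assumption made for illustration: the "contextual entropy" is taken to be the sum of -p(w) log p(w) over the tokens of the text, with p(w) the word's relative frequency in a language's word distribution, and words unknown to a language contribute nothing. The function names (`word_distribution`, `contextual_entropy`, `detect_language`) and the toy corpora are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from typing import Dict, Mapping


def word_distribution(corpus: str) -> Dict[str, float]:
    """Relative word frequencies of a whitespace-tokenized, lowercased corpus."""
    counts = Counter(corpus.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def contextual_entropy(text: str, dist: Mapping[str, float]) -> float:
    """Assumed scoring rule: sum of -p(w) * log p(w) over the tokens of `text`.

    p(w) comes from the language's word distribution `dist`.
    Tokens missing from `dist` are skipped (they contribute 0).
    """
    return sum(
        -dist[word] * math.log(dist[word])
        for word in text.lower().split()
        if word in dist
    )


def detect_language(text: str, dists: Mapping[str, Mapping[str, float]]) -> str:
    """Return the language (context) in which the text has maximal entropy."""
    return max(dists, key=lambda lang: contextual_entropy(text, dists[lang]))


# Toy word distributions; a real system would estimate these from large corpora.
DISTRIBUTIONS = {
    "en": word_distribution("the cat sat on the mat and the dog ate the food"),
    "es": word_distribution("el gato se sentó en la alfombra y el perro comió la comida"),
}

if __name__ == "__main__":
    print(detect_language("the dog sat", DISTRIBUTIONS))
    print(detect_language("el perro comió", DISTRIBUTIONS))
```

Under these assumptions, no training step is needed beyond counting words per language, which matches the abstract's claim that only word distributions are required.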

Change-Detection Using Contextual Information and Fuzzy Entropy Principle

This paper presents an unsupervised change detection method for computing the amount of change that has occurred within an area by using remote sensing technologies and fuzzy modeling. The discussion concentrates on the formulation of a standard procedure that, using the concept of fuzzy sets and fuzzy logic, can define the likelihood of changes detected from remotely sensed data. The fuzzy ...

Statistical Language Identification of Short Texts

Although correctly identifying the language of short texts should prove useful in a large number of applications, few satisfactory attempts are reported in the literature. In this paper, we describe a Naive Bayes classifier that performs well on very short texts, as well as the corpus that we created from movie subtitles for training it. Both the corpus and the algorithm are available under the G...

Off-topic essay detection using short prompt texts

Our work addresses the problem of predicting whether an essay is off-topic with respect to a given prompt or question without any previously seen essays as training data. Prior work has used similarity between essay vocabulary and prompt words to estimate the degree of on-topic content. In our corpus of opinion essays, prompts are very short, and using similarity with such prompts to detect off-topic essays y...

Journal

Journal title: Research in Computing Science

Year: 2015

ISSN: 1870-4069

DOI: 10.13053/rcs-90-1-27